The Degrees of Freedom of the $\chi^2$-test of Dimensionality

A. M. Kshirsagar
Southern Methodist University
Dallas, Texas 75222

1. INTRODUCTION: Wani and Kabe (abbreviated as W & K hereafter) [2]
have recently given an elegant derivation of the likelihood ratio
criterion for testing the hypothesis that the dimensionality of the
space of the means of $k$ $p$-variate normal populations is $s$. The main
difference between their derivation and the one given in Rao [1] is
that Rao uses geometrical terminology while W & K's derivation is
completely analytical. However, their proof is incomplete without the
degrees of freedom of the $\chi^2$-test. It would be a pity to have
to resort to Rao's geometrical terminology just for the degrees of freedom
(d.f.) of the test. The object of this note is therefore to derive the
number of d.f. analytically and so complete the W & K derivation.

2. Degrees of Freedom of the Likelihood Ratio Criterion:

We shall use the same notation as W & K and shall not reproduce
it here, to economize space. The number of d.f. of the likelihood ratio
criterion is the difference between the number of parameters with
respect to (w.r.t.) which the likelihood $L$ is maximized in the entire
parameter space $\Omega$ and in the space $\omega$ restricted by the hypothesis

$$H_0: H\mu_i = \xi \quad (i = 1, 2, \ldots, k)$$

given by equation (1) of W & K. The number of parameters in $\Omega$ is $pk$,
the $p$ means of each of the $k$ populations. Let us now count the number
of parameters estimated while deriving max. $L$ in $\omega$. W & K's first
step in maximizing $L$ is equivalent to making a transformation from
$U_i$ $(i = 1, 2, \ldots, k)$ to

$$\begin{bmatrix} H U_i \\ K U_i \end{bmatrix}, \tag{1}$$


where $K$ is an $r \times p$ matrix of rank $r = p - s$, such that $H \Sigma K' = 0$, so
that the exponent (apart from the factor $\tfrac{1}{2}$) in $L$, given by (2) of W & K,
reduces to (on using $H_0$),

$$N \sum_{i=1}^{k} (\xi - H U_i)' (H \Sigma H')^{-1} (\xi - H U_i)
+ N \sum_{i=1}^{k} (K U_i - K \mu_i)' (K \Sigma K')^{-1} (K U_i - K \mu_i). \tag{2}$$


This is then minimized w.r.t. $K\mu_i$ $(i = 1, \ldots, k)$ first. In other words,
we estimate the $rk$ parameters $K\mu_i$ $(i = 1, \ldots, k)$ here. The second term
in (2) therefore vanishes when the minimum value is taken. This step
is hidden in W & K. Next they minimize (2) w.r.t. the unknown $\xi$, i.e., they
estimate a further $s$ parameters. Finally, they minimize


$$\operatorname{tr}\{ (H \Sigma H')^{-1} H B H' \}, \tag{3}$$


where $B = N \left( \sum_{i=1}^{k} U_i U_i' - k \bar{U} \bar{U}' \right)$,
w.r.t. the unknown $H$, the only condition
being that rank $H$ is $s$. One may think here that the additional number
of parameters estimated in this step is $ps$, the number of elements of $H$, but
this is not true, because the quantity in (3) is also equal to

$$\operatorname{tr}\{ H^{*\prime} (H^* \Sigma H^{*\prime})^{-1} H^* B \}, \tag{4}$$

where $H^* = AH$ and $A$ is any arbitrary non-singular $s \times s$ matrix, and we
can choose $A$ to be $H_1^{-1}$, where

$$H = [H_1 \mid H_2], \tag{5}$$

$H_1$ being $s \times s$ and $H_2$ being $s \times (p - s)$. $H_1$ can be assumed to be
non-singular without loss of generality, as rank $H = s$. This reduces $H^*$ to

$$[I_s \mid H_1^{-1} H_2], \tag{6}$$

which has only $s(p - s)$ unknown elements. Thus, the number of unknown
parameters estimated in minimizing (3) is only $s(p - s)$. The total number
of parameters in $\omega$ is therefore

$$rk + s + s(p - s) \tag{7}$$

and the degrees of freedom of the $\chi^2$ test are

$$pk - (rk + s + ps - s^2) = (p - r)(k - 1 - r). \tag{8}$$
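The two facts used in this section, the invariance of (3) under $H \to AH$ and the count leading to (8), can be checked numerically. The following is only a minimal sketch in Python with exact rational arithmetic; the function names and the small matrices `H`, `SIGMA` and `B` (with $p = 3$, $s = 2$) are illustrative assumptions and are not taken from W & K.

```python
from fractions import Fraction


def df_by_counting(p, k, s):
    """d.f. as (parameters in Omega) minus (parameters in omega), cf. (7)."""
    r = p - s
    params_full = p * k                     # p means for each of k populations
    params_null = r * k + s + s * (p - s)   # K mu_i, xi, free elements of H*
    return params_full - params_null


def df_closed_form(p, k, s):
    """Right-hand side of (8)."""
    r = p - s
    return (p - r) * (k - 1 - r)


def matmul(X, Y):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)]
            for row in X]


def transpose(X):
    return [list(col) for col in zip(*X)]


def inverse(X):
    """Gauss-Jordan inverse over the rationals; assumes X is non-singular."""
    n = len(X)
    aug = [[Fraction(v) for v in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(X)]
    for c in range(n):
        pivot = next(i for i in range(c, n) if aug[i][c] != 0)
        aug[c], aug[pivot] = aug[pivot], aug[c]
        pv = aug[c][c]
        aug[c] = [v / pv for v in aug[c]]
        for i in range(n):
            if i != c and aug[i][c] != 0:
                f = aug[i][c]
                aug[i] = [vi - f * vc for vi, vc in zip(aug[i], aug[c])]
    return [row[n:] for row in aug]


def criterion(H, Sigma, B):
    """tr{(H Sigma H')^{-1} H B H'}, the quantity in (3)."""
    HSHt = matmul(matmul(H, Sigma), transpose(H))
    HBHt = matmul(matmul(H, B), transpose(H))
    M = matmul(inverse(HSHt), HBHt)
    return sum(M[i][i] for i in range(len(M)))


# Illustrative (assumed) values: H is s x p with s = 2, p = 3.
H = [[1, 2, 3], [0, 1, 4]]
SIGMA = [[2, 1, 0], [1, 2, 1], [0, 1, 2]]   # positive definite
B = [[3, 1, 0], [1, 2, 1], [0, 1, 4]]       # symmetric

# Choose A = H_1^{-1}, where H_1 is the leading s x s block of H.
H1_INV = inverse([row[:2] for row in H])
H_STAR = matmul(H1_INV, H)                  # has the form [I_s | H_1^{-1} H_2]
```

In this example `H_STAR` has an identity leading block, so only its last $p - s$ columns, that is $s(p - s)$ elements, remain free, and `criterion` returns the same value for `H` and `H_STAR`.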

3. Equation of the r-dimensional Flat:

Rao bases his derivation on the fact that the hypothesis $H_0$ is
geometrically equivalent to the statement that the $k$ points (representing
the means of the $k$ populations) lie on an $r$-dimensional flat, and he
then proceeds to write its vectorial equation. It will perhaps be
instructive to demonstrate this analytically. If $H_0$ is true,

$$HM = \xi E_{1k}, \tag{9}$$

where

$$M = [\mu_1, \mu_2, \ldots, \mu_k], \tag{10}$$

and $E_{ab}$ denotes an $a \times b$ matrix with all unit elements. Hence $HM^* = 0$,
where $M^* = M \left( I - \frac{1}{k} E_{kk} \right)$, so that $M^*$ is of rank $p - s = r$, as $H$ is of
rank $s$ and its rank cannot be improved upon. $M^*$ has therefore $r$
linearly independent column vectors. Let us denote them by $\mu_i^*$
$(i = 1, 2, \ldots, r)$. But it is easy to see from the relationship between
$M$ and $M^*$ that the difference between any two columns of $M$ is the same as
the difference between the corresponding columns of $M^*$, and so

$$\begin{aligned}
\mu_i &= \mu_1 + (i\text{th column of } M^* - \text{1st column of } M^*) \\
      &= \mu_1 + \text{a linear combination of } \mu_l^* \quad (l = 1, 2, \ldots, r).
\end{aligned}$$

This is the vectorial equation of the $r$-dimensional flat which Rao uses
and is determined by the $(r + 1)$ independent points $\mu_1$ and
$\mu_i^*$ $(i = 1, 2, \ldots, r)$.
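The argument of this section can be illustrated on a small example. The sketch below is only an illustration under assumed values ($p = 3$, $s = 2$, $r = 1$, $k = 4$, with the means placed on a line); the names `center_columns`, `column` and the matrices `H` and `M` are hypothetical and chosen so that $H\mu_i$ is the same vector for every $i$.

```python
from fractions import Fraction


def matmul(X, Y):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)]
            for row in X]


def center_columns(M):
    """Return M* = M (I - E_kk / k): each column minus the mean column."""
    k = len(M[0])
    return [[Fraction(x) - Fraction(sum(row), k) for x in row] for row in M]


def column(M, j):
    return [row[j] for row in M]


def diff(u, v):
    return [a - b for a, b in zip(u, v)]


# Assumed example: H annihilates the direction d = (1, 1, 1)'.
H = [[1, -1, 0], [0, 1, -1]]

# Columns are mu_i = mu_1 + t_i * d with mu_1 = (1, 2, 4)' and t = 0, 1, 2, 5,
# so the k = 4 means lie on an r = 1 dimensional flat.
M = [[1, 2, 3, 6],
     [2, 3, 4, 7],
     [4, 5, 6, 9]]

HM = matmul(H, M)              # every column equals xi = (-1, -2)'
M_STAR = center_columns(M)
HM_STAR = matmul(H, M_STAR)    # the zero matrix, as in H M* = 0

# Differences between columns are the same in M and in M*.
DIFFS_M = [diff(column(M, i), column(M, 0)) for i in range(4)]
DIFFS_M_STAR = [diff(column(M_STAR, i), column(M_STAR, 0)) for i in range(4)]
```

Here every column of `M_STAR` is a multiple of $(1, 1, 1)'$, so $M^*$ has rank $r = 1$ and each $\mu_i$ is $\mu_1$ plus a multiple of a single column of $M^*$.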

I am indebted to Dr. J. T. Webster for his help and discussion.